Graph neural networks (GNNs) have recently emerged as a promising paradigm for learning from graph-structured data and have demonstrated wide success across various domains such as recommendation systems, social networks, and electronic design automation (EDA). Like other deep learning (DL) methods, GNNs are being deployed on sophisticated modern hardware systems as well as dedicated accelerators. However, despite the popularity of GNNs and the recent efforts to bring GNNs to hardware, the fault tolerance and resilience of GNNs have generally been overlooked. Inspired by the inherent algorithmic resilience of DL methods, this paper conducts, for the first time, a large-scale empirical study of GNN resilience, aiming to understand the relationship between hardware faults and GNN accuracy. By developing a customized fault injection tool on top of PyTorch, we perform extensive fault injection experiments on various GNN models and application datasets. We observe that the error resilience of GNN models varies by orders of magnitude across different models and application datasets. Further, we explore a low-cost error mitigation mechanism for GNNs to enhance their resilience. This GNN resilience study aims to open up new directions and opportunities for future GNN accelerator design and architectural optimization.
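The abstract does not detail the fault model, but a common choice in such resilience studies is single-bit flips in model weights. Below is a minimal sketch of that idea (the function name and NumPy-based fault model are illustrative assumptions, not the paper's actual tool):

```python
import numpy as np

def flip_bit(weights, index, bit):
    """Hypothetical single-bit-flip fault model: flip one bit of a
    float32 weight by reinterpreting its bytes as uint32.

    weights : 1-D float32 array of model parameters
    index   : which weight to corrupt
    bit     : which bit (0 = LSB, 31 = sign bit) to flip
    Returns a corrupted copy; the input is left untouched.
    """
    w = np.array(weights, dtype=np.float32, copy=True)
    raw = w.view(np.uint32)          # same buffer, integer view
    raw[index] ^= np.uint32(1 << bit)  # XOR toggles exactly one bit
    return w
```

Injecting such faults into a trained model and re-running inference, then comparing accuracy against the fault-free baseline, is the usual experimental loop in this kind of study.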
Conventional works generally employ a two-phase model, in which a generator selects the most important pieces of the input text, followed by a predictor that makes predictions based on the selected pieces. However, such a two-phase model may incur the degeneration problem, in which the predictor overfits to the noise produced by the not-yet-well-trained generator, which in turn causes the generator to converge to a sub-optimal model that tends to select meaningless pieces. To tackle this challenge, we propose Folded Rationalization (FR), which folds the two phases of the rationale model into one from the perspective of text semantic extraction. The key idea of FR is to employ a unified encoder shared between the generator and the predictor, based on which FR can facilitate a better predictor by accessing valuable information that is blocked by the generator in the traditional two-phase model, thereby also bringing a better generator. Empirically, we show that FR improves the F1 score by up to 10.3% compared with state-of-the-art methods.
The ability to train deep neural networks under label noise is appealing, since imperfectly annotated data are relatively cheap to obtain. State-of-the-art approaches are based on semi-supervised learning (SSL), which selects small-loss examples as clean and then applies SSL techniques to boost performance. However, the selection step mostly yields a medium-sized clean subset, overlooking a rich set of clean samples. In this work, we propose ProMix, a novel framework for learning with noisy labels that attempts to maximize the utility of clean samples for improved performance. The key to our method is a matched high-confidence selection technique, which selects those examples that have high prediction confidence and whose predictions match their given labels. Combined with small-loss selection, our method achieves a precision of 99.27 and a recall of 98.22 in detecting clean samples on the CIFAR-10N dataset. Based on such a large set of clean data, ProMix improves upon the best baseline method by +2.67% on CIFAR-10N and +1.61% on the CIFAR-100N dataset. Code and data are available at https://github.com/justherozen/promix
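The matched high-confidence selection rule described above is simple to state: keep an example if its predicted class agrees with its given label and the prediction confidence exceeds a threshold. A minimal sketch (the function name and threshold value are illustrative assumptions, not ProMix's exact implementation):

```python
import numpy as np

def matched_high_confidence_select(probs, given_labels, tau=0.9):
    """Select indices of examples whose predicted class matches the
    given (possibly noisy) label with confidence at least tau.

    probs        : (N, C) array of softmax probabilities
    given_labels : (N,) array of dataset labels
    Returns the indices of examples deemed clean.
    """
    preds = probs.argmax(axis=1)   # predicted class per example
    conf = probs.max(axis=1)       # confidence of that prediction
    mask = (preds == given_labels) & (conf >= tau)
    return np.nonzero(mask)[0]
```

In the paper this rule is combined with the classic small-loss criterion, so an example can enter the clean set through either test.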
The data imbalance between common and rare diseases during model training often causes intelligent diagnosis systems to make predictions biased toward common diseases. State-of-the-art approaches adopt a two-stage learning framework to alleviate the class-imbalance issue, where the first stage focuses on training a general feature extractor and the second stage focuses on fine-tuning the classifier head for class rebalancing. However, existing two-stage approaches do not consider the fine-grained properties that distinguish different diseases, often causing the first stage to be less effective for medical image classification than for natural image classification tasks. In this study, we propose embedding metric learning into the first stage of the two-stage framework to help the feature extractor learn to extract more discriminative feature representations. Extensive experiments, mainly on three medical image datasets, show that the proposed approach consistently outperforms existing one-stage and two-stage approaches, suggesting that metric learning can serve as a plug-and-play component in the first stage of the two-stage framework for medical image classification tasks with fine-grained class differences.
Imbalanced training data poses a significant challenge for medical image classification. In this study, we propose a novel Progressive Class-Center Triplet (PCCT) framework to alleviate the class-imbalance issue, particularly for diagnosing rare diseases, mainly by carefully designing the triplet sampling strategy and the triplet loss formulation. Specifically, the PCCT framework consists of two successive stages. In the first stage, PCCT trains the diagnosis system with a class-balanced triplet loss to coarsely separate the distributions of different classes. In the second stage, the PCCT framework further improves the diagnosis system with a class-center-involved triplet loss, leading to a more compact distribution for each class. For the class-balanced triplet loss, triplets are sampled equally for each class in every training iteration, thereby alleviating the imbalanced-data issue. For the class-center-involved triplet loss, the positive and negative samples in each triplet are replaced by their corresponding class centers, which enforces that data representations of the same class stay close to the class center. Furthermore, the class-center-involved triplet loss is extended to a pairwise ranking loss and a quadruplet loss, demonstrating the generalizability of the proposed framework. Extensive experiments support that the PCCT framework is effective for medical image classification with imbalanced training images. On two skin image datasets and one chest X-ray dataset, the proposed approach obtains mean F1 scores of 86.2, 65.2, and 90.66 over all classes, and 81.4, 63.87, and 81.92 for the rare classes, respectively, achieving state-of-the-art performance and outperforming widely used approaches for the class-imbalance issue.
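The second-stage loss described above replaces the positive and negative samples of a triplet with class centers. A minimal sketch of that idea, assuming Euclidean distance and a standard hinge-style margin (the function name and defaults are illustrative, not the paper's exact formulation):

```python
import numpy as np

def class_center_triplet_loss(anchor, centers, label, margin=1.0):
    """Triplet loss where the positive/negative are class centers
    rather than individual samples (sketch of the PCCT stage-2 idea).

    anchor  : (D,) embedding of one sample
    centers : (C, D) per-class center embeddings
    label   : the anchor's class index
    """
    d_pos = np.linalg.norm(anchor - centers[label])  # distance to own center
    d_neg = min(np.linalg.norm(anchor - centers[c])  # nearest other center
                for c in range(len(centers)) if c != label)
    return max(0.0, d_pos - d_neg + margin)          # hinge with margin
```

Pulling anchors toward their own center and pushing them past the nearest foreign center by a margin is what yields the compact per-class distributions the abstract mentions.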
Accurate automated analysis of electroencephalography (EEG) would greatly help clinicians effectively monitor and diagnose patients with various brain diseases. Compared with supervised learning on labeled disease EEG data, which can train a model to analyze specific diseases but cannot monitor previously unseen conditions, anomaly detection based only on normal EEGs can detect any potential anomaly in new EEGs. Different from existing anomaly detection strategies, which do not consider any property of the unavailable abnormal data during model development, a task-oriented self-supervised learning approach is proposed here. It makes use of the available normal EEGs and expert knowledge about abnormal EEGs to train a more effective feature extractor for the subsequent development of anomaly detectors. In addition, a specific two-branch convolutional neural network with larger kernels is designed as the feature extractor, so that it can more easily extract both larger-scale and smaller-scale features, which often appear in the unavailable abnormal EEGs. As demonstrated on three EEG datasets, the effectively designed and trained feature extractor can extract better feature representations from normal data for developing anomaly detectors and for detecting anomalies in future new EEGs. The code is available at https://github.com/irining/eeg-ad.
Continually learning to segment more and more types of image regions is a desired capability for many intelligent systems. However, such continual semantic segmentation suffers from the same catastrophic forgetting issue as continual classification learning. While multiple knowledge distillation strategies originally developed for continual classification have been well adapted to continual semantic segmentation, they only consider transferring old knowledge based on the outputs from one or more layers of deep fully convolutional networks. Different from existing solutions, this study proposes to transfer a new type of knowledge-relevant information, i.e., the relationships between elements (e.g., pixels or small local regions) within each image, which can capture both within-class and between-class knowledge. The relationship information can be effectively obtained from the self-attention maps in a Transformer-style segmentation model. Considering that pixels belonging to the same class in each image often share similar visual properties, a class-specific region pooling is applied to provide more efficient relationship information for knowledge transfer. Extensive evaluations on multiple public benchmarks support that the proposed self-attention transfer method can further effectively alleviate the catastrophic forgetting issue, and its flexible combination with one or more widely adopted strategies significantly outperforms state-of-the-art solutions.
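The class-specific region pooling mentioned above amounts to averaging the per-pixel features (or attention vectors) over all pixels sharing a class label, yielding one compact descriptor per class. A minimal sketch of that operation (the function name and dictionary return type are illustrative assumptions):

```python
import numpy as np

def class_region_pool(features, labels, num_classes):
    """Pool per-pixel feature vectors by class label (sketch).

    features : (N, D) array of flattened per-pixel features
    labels   : (N,) array of per-pixel class labels
    Returns {class_id: (D,) mean feature} for classes present in the image.
    """
    pooled = {}
    for c in range(num_classes):
        mask = labels == c
        if mask.any():                       # skip classes absent from image
            pooled[c] = features[mask].mean(axis=0)
    return pooled
```

Distilling relationships between these pooled class descriptors, instead of between all pixel pairs, keeps the transferred relationship information compact.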
Federated learning (FL) is an important paradigm for training global models from decentralized data in a privacy-preserving way. Existing FL methods usually assume the global model can be trained on any participating client. However, in real applications, client devices are usually heterogeneous and have different computing capabilities. Although big models like BERT have achieved huge success in AI, it is difficult to apply them to heterogeneous FL with weak clients. Straightforward solutions, such as removing the weak clients or using a small model to fit all clients, lead to problems such as the under-representation of dropped clients and inferior accuracy due to data loss or limited model representation ability. In this work, we propose an inclusive federated learning method to address this problem. The core idea of inclusive FL is to assign models of different sizes to clients with different computing capabilities: bigger models for powerful clients and smaller ones for weak clients. We also propose an effective method to share knowledge among multiple local models of different sizes. In this way, all clients can participate in model learning in FL, and the final model can be sufficiently large. Besides, we propose a momentum knowledge distillation method to better transfer knowledge from the big models on powerful clients to the small models on weak clients. Extensive experiments on many real-world benchmark datasets demonstrate the effectiveness of the proposed method in learning accurate models from clients with heterogeneous devices under the FL framework.
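The abstract does not spell out the momentum distillation update, but momentum-based knowledge transfer is commonly implemented as an exponential moving average of the distillation target. A generic sketch of that pattern, offered only as an assumption about the flavor of update involved (the function name and coefficient are illustrative):

```python
def momentum_update(target, source, m=0.9):
    """Exponential-moving-average update, a common building block of
    momentum knowledge distillation (generic sketch, not the paper's
    exact formulation).

    target : list of floats (e.g., a distillation target's parameters)
    source : list of floats (e.g., the current model's parameters)
    m      : momentum coefficient; higher = smoother, slower target
    """
    return [m * t + (1.0 - m) * s for t, s in zip(target, source)]
```

Smoothing the teacher signal this way reduces the noise that individual client updates would otherwise inject into the distillation target.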
Partial label learning (PLL) is an important problem that allows each training example to be labeled with a coarse candidate set, which well suits many real-world data annotation scenarios with label ambiguity. Despite the promise, the performance of PLL often lags behind the supervised counterpart. In this work, we bridge the gap by addressing two key research challenges in PLL -- representation learning and label disambiguation -- in one coherent framework. Specifically, our proposed framework PiCO consists of a contrastive learning module along with a novel class prototype-based label disambiguation algorithm. PiCO produces closely aligned representations for examples from the same classes and facilitates label disambiguation. Theoretically, we show that these two components are mutually beneficial, and can be rigorously justified from an expectation-maximization (EM) algorithm perspective. Moreover, we study a challenging yet practical noisy partial label learning setup, where the ground-truth may not be included in the candidate set. To remedy this problem, we present an extension PiCO+ that performs distance-based clean sample selection and learns robust classifiers by a semi-supervised contrastive learning algorithm. Extensive experiments demonstrate that our proposed methods significantly outperform the current state-of-the-art approaches in standard and noisy PLL tasks and even achieve comparable results to fully supervised learning.
Acyclic models, often depicted as directed acyclic graphs (DAGs), have been widely employed to represent directed causal relations among collected nodes. In this paper, we propose an efficient method to learn linear non-Gaussian DAG models in high-dimensional cases, where the noise can follow any continuous non-Gaussian distribution. This is in sharp contrast to most existing DAG learning methods, which assume Gaussian noise with additional variance assumptions in order to attain exact DAG recovery. The proposed method leverages a novel concept of topological layers to facilitate DAG learning. In particular, we show that the topological layers can be exactly reconstructed in a bottom-up fashion, and that the parent-child relations among nodes in each layer can also be consistently established. More importantly, the proposed method does not require the faithfulness or parental faithfulness assumptions that have been widely adopted in the DAG learning literature. Its advantages are further supported by numerical comparisons against several popular competitors on various simulated examples, as well as by a real application to the global spread of COVID-19.
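The bottom-up layer reconstruction can be illustrated on a known graph: the bottom layer consists of leaf nodes (nodes with no children), which are then peeled off and the process repeated. This sketch only shows the layer decomposition given a graph; the paper's contribution is identifying these layers from observational data via non-Gaussianity, which is not reproduced here (the function name and dict-of-sets encoding are illustrative assumptions):

```python
def topological_layers(children):
    """Decompose a DAG into topological layers, bottom-up.

    children : dict mapping each node to the set of its child nodes
               (must be acyclic, or this loop never terminates)
    Returns a list of layers; layer 0 contains the leaf nodes.
    """
    remaining = set(children)
    layers = []
    while remaining:
        # A node joins the current layer once all its children are peeled off
        leaves = {n for n in remaining if not (children[n] & remaining)}
        layers.append(sorted(leaves))
        remaining -= leaves
    return layers
```

Within each recovered layer, the remaining task is to establish parent-child relations from the layers above, which the paper shows can be done consistently.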